Skip to content

test(opentelemetry): Avoid wallclock race in startSpan child end-time test#21045

Merged
mydea merged 1 commit into
developfrom
fix/opentelemetry-flaky-startSpan-child-end-time
May 20, 2026
Merged

test(opentelemetry): Avoid wallclock race in startSpan child end-time test#21045
mydea merged 1 commit into
developfrom
fix/opentelemetry-flaky-startSpan-child-end-time

Conversation

@mydea
Copy link
Copy Markdown
Member

@mydea mydea commented May 20, 2026

Summary

  • Bumps the nowSec offset from +1 to +10 seconds in the "converts seconds to milliseconds for startSpan child span" test.

Root cause

From the failing run:

AssertionError: expected 1000000 to be +0 // Object.is equality
> 2197 |     expect(capturedEndTime![1]).toBe(0);

The test does:

const nowSec = Math.floor(Date.now() / 1000) + 1;
startSpan({ name: 'outer' }, () => {
  const innerSpan = startInactiveSpan({ name: 'inner' });
  innerSpan.end(nowSec);
  capturedEndTime = getSpanEndTime(innerSpan);
});
expect(capturedEndTime![1]).toBe(0);

If the wallclock crosses the next-second boundary between computing nowSec and creating innerSpan (more likely here than in sibling tests because of the extra outer + callback overhead), innerSpan.startTime lands at [nowSec, ~N ms in nanos]after the requested endTime = [nowSec, 0]. OTel's Span.end then detects negative duration and clamps:

// @opentelemetry/sdk-trace-base / Span.js
if (this._duration[0] < 0) {
  this.endTime = this.startTime.slice();
  this._duration = [0, 0];
}

After the clamp, endTime[1] equals startTime[1] — the wallclock nanos, not 0. The reported 1000000 ns is exactly the case where setup ran ~1 ms into the next second.

Bumping the buffer to +10 s makes the clamp branch unreachable during normal test execution. Sibling tests using +1 only create a single span, so they finish well before the boundary.

Fixes #20962

🤖 Generated with Claude Code

… test

The "converts seconds to milliseconds for startSpan child span" test
computed `nowSec = Math.floor(Date.now()/1000) + 1` and then created
both an outer `startSpan` and an inner `startInactiveSpan` before
calling `innerSpan.end(nowSec)`. When the surrounding setup pushed
wallclock past `nowSec*1000`, OTel's `Span.end` saw `endTime < startTime`,
hit the `if (duration[0] < 0)` clamp in `sdk-trace-base/Span.js`, and
overwrote endTime with startTime — leaving endTime[1] equal to the
sub-second nanos of startTime (1000000 in the reported failure) instead
of the expected 0.

Bumping the offset to +10s gives the test enough headroom that the
clamp branch can no longer trip during normal execution. Sibling tests
in the same describe block use +1 but do not flake in practice because
they create only a single span.

Fixes #20962

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mydea mydea requested a review from a team as a code owner May 20, 2026 08:44
@mydea mydea merged commit 3df59a6 into develop May 20, 2026
425 of 427 checks passed
@mydea mydea deleted the fix/opentelemetry-flaky-startSpan-child-end-time branch May 20, 2026 09:32
mydea added a commit that referenced this pull request May 20, 2026
…stamp tests (#21061)

## Summary

Closes #20901. Closes #20851.

PR #21045 fixed `converts seconds to milliseconds for startSpan child
span` by bumping its `+1`s buffer to `+10`s. The same wallclock-boundary
race affects the three sibling tests in the same describe block:

- `converts seconds to milliseconds for startInactiveSpan` ← flake
#20901
- `converts seconds to milliseconds for startSpanManual callback span` ←
flake #20851
- `handles HrTime input for startInactiveSpan` — no flake report yet,
but vulnerable to the same clamp; if the wallclock rolls past `nowSec *
1000` ms before span creation, OTel clamps `endTime` to `startTime` and
`endTime[1]` is no longer `500_000_000`. Bumped for consistency.

If `Date.now()` is in the last few microseconds of a second when `nowSec
= Math.floor(Date.now() / 1000) + 1` is computed, span creation can
cross the boundary so the span's `startTime` lands at `[nowSec, ~N ns]`
— *after* the requested `endTime = [nowSec, 0]`. OTel's `Span.end` then
triggers:

```js
if (this._duration[0] < 0) {
  this.endTime = this.startTime.slice();
  this._duration = [0, 0];
}
```

…copying the startTime nanos onto `endTime[1]`, breaking the `toBe(0)`
(or `toBe(500_000_000)`) assertion.

Bumping the buffer to `+10`s puts the requested `endTime` safely past
the span's start time so the clamp can't fire under normal test
execution.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Flaky CI]: Node (22) Unit Tests - test/trace.test.ts > span.end() timestamp conversion > converts seconds to milliseconds for startSpan child span

2 participants